You're out of free questions.

Upgrade now

You are a renowned thief who has recently switched from stealing precious metals to stealing cakes because of the insane profit margins. You end up hitting the jackpot, breaking into the world's largest privately owned stock of cakes—the vault of the Queen of England.

While Queen Elizabeth has a limited number of types of cake, she has an unlimited supply of each type.

Each type of cake has a weight and a value, stored in a tuple with two indices:

  1. An integer representing the weight of the cake in kilograms
  2. An integer representing the monetary value of the cake in British shillings

For example:

  # Weighs 7 kilograms and has a value of 160 shillings
(7, 160)

# Weighs 3 kilograms and has a value of 90 shillings
(3, 90)

You brought a duffel bag that can hold limited weight, and you want to make off with the most valuable haul possible.

Write a function max_duffel_bag_value() that takes a list of cake type tuples and a weight capacity, and returns the maximum monetary value the duffel bag can hold.

For example:

  cake_tuples = [(7, 160), (3, 90), (2, 15)]
capacity    = 20

# Returns 555 (6 of the middle type of cake and 1 of the last type of cake)
max_duffel_bag_value(cake_tuples, capacity)

Weights and values may be any non-negative integer. Yes, it's weird to think about cakes that weigh nothing or duffel bags that can't hold anything. But we're not just super mastermind criminals—we're also meticulous about keeping our algorithms flexible and comprehensive.

Gotchas

Does your function work if the duffel bag's weight capacity is 0 kg?

Does your function work if any of the cakes weigh 0 kg? Think about a cake whose weight and value are both 0.

We can do this in O(nk)O(n*k) time and O(k)O(k) space, where nn is the number of types of cakes and kk is the duffel bag's capacity!

Breakdown

The brute force approach is to try every combination of cakes, but that would take a really long time—you'd surely be captured.

What if we just look at the cake with the highest value?

We could keep putting the cake with the highest value into our duffel bag until adding one more would go over our weight capacity. Then we could look at the cake with the second highest value, and so on until we find a cake that’s not too heavy to add.

Will this work?

Nope. Let's say our capacity is 100 kg and these are our two cakes:

  [(1, 30), (50, 200)]

With our approach, we’ll put in two of the second type of cake for a total value of 400 shillings. But we could have put in a hundred of the first type of cake, for a total value of 3000 shillings!

Just looking at the cake's values won’t work. Can we improve our approach?

Well, why didn’t it work?

We didn’t think about the weight! How can we factor that in?

What if instead of looking at the value of the cakes, we looked at their value/weight ratio? Here are our example cakes again:

  [(1, 30), (50, 200)]

The second cake has a higher value, but look at the value per kilogram.

The second type of cake is worth 4 shillings/kg (200/50200/50), but the first type of cake is worth 30 shillings/kg (30/130/1)!

Ok, can we just change our algorithm to use the highest value/weight ratio instead of the highest value? We know it would work in our example above, but try some more tests to be safe.

We might run into problems if the weight of the cake with the highest value/weight ratio doesn’t fit evenly into the capacity. Say we have these two cakes:

  [(3, 40), (5, 70)]

If our capacity is 8 kg, no problem. Our algorithm chooses one of each cake, giving us a haul worth 110 shillings, which is optimal.

But if the capacity is 9 kg, we're in trouble. Our algorithm will again choose one of each cake, for a total value of 110 shillings. But the actual optimal value is 120 shillings—three of the first type of cake!

So even looking at the value/weight ratios doesn’t always give us the optimal answer!

Let’s step back. How can we ensure we get the optimal value we can carry?

Try thinking small. How can we calculate the maximum value for a duffel bag with a weight capacity of 1 kg? (Remember, all our weights and values are integers.)

If the capacity is 1 kg, we’ll only care about cakes that weigh 1 kg (for simplicity, let's ignore zeroes for now). And we'd just want the one with the highest value.

We could go through every cake, using a greedy

A greedy algorithm builds up a solution by choosing the option that looks the best at every step.

Say you're a cashier and need to give someone 67 cents (US) using as few coins as possible. How would you do it?

Whenever picking which coin to use, you'd take the highest-value coin you could. A quarter, another quarter, then a dime, a nickel, and finally two pennies. That's a greedy algorithm, because you're always greedily choosing the coin that covers the biggest portion of the remaining amount.

Some other places where a greedy algorithm gets you the best solution:

  • Trying to fit as many overlapping meetings as possible in a conference room? At each step, schedule the meeting that ends earliest.
  • Looking for a minimum spanning tree in a graph? At each step, greedily pick the cheapest edge that reaches a new vertex.

Careful: sometimes a greedy algorithm doesn't give you an optimal solution:

Validating that a greedy strategy always gets the best answer is tricky. Either prove that the answer produced by the greedy algorithm is as good as an optimal answer, or run through a rigorous set of test cases to convince your interviewer (and yourself) that its correct.

approach to keep track of the max value we’ve seen so far.

Here’s an example solution:

  def max_duffel_bag_value_with_capacity_1(cake_tuples):
    max_value_at_capacity_1 = 0

    for cake_weight, cake_value in cake_tuples:
        if cake_weight == 1:
            max_value_at_capacity_1 = max(max_value_at_capacity_1, cake_value)

    return max_value_at_capacity_1

Ok, now what if the capacity is 2 kg? We’ll need to be a bit more clever.

It’s pretty similar. Again we’ll track a max value, let’s say with a variable max_value_at_capacity_2. But now we care about cakes that weigh 1 or 2 kg. What do we do with each cake? And keep in mind, we can lean on the code we used to get the max value at weight capacity 1 kg.

  1. If the cake weighs 2 kg, it would fill up our whole capacity if we just took one. So we just need to see if the cake's value is higher than our current max_value_at_capacity_2.
  2. If the cake weighs 1 kg, we could take one, and we'd still have 1 kg of capacity left. How do we know the best way to fill that extra capacity? We can use the max value at capacity 1. We’ll see if adding the cake's value to the max value at capacity 1 is better than our current max_value_at_capacity_2.

Does this apply more generally? If we can use the max value at capacity 1 to get the max value at capacity 2, can we use the max values at capacity 1 and 2 to get the max value at capacity 3?

Looks like this problem might have overlapping subproblems!

A problem has overlapping subproblems if finding its solution involves solving the same subproblem multiple times.

As an example, let's look at the Fibonacci sequence (the series where each number is the sum of the two previous ones—0, 1, 1, 2, 3, 5, 8, ...).

If we wanted to compute the nnth Fibonacci number, we could use this simple recursive algorithm:

  def fib(n):
    if n == 0 or n == 1:
        return n
    return fib(n - 1) + fib(n - 2)

We'd call fib(n - 1) and fib(n - 2) subproblems of fib(n).

Now let's look at what happens when we call fib(5):

A binary tree showing the recursive calls of calling fib of 5. Every fib of n call calls fib of n minus 1 and fib of n minus 2. So calling fib of 5 calls fib of 4 and fib of 3, which keep calling fib of lower numbers until reaching the base cases fib of 1 or fib of 0.

Our function ends up recursively calling fib(2) three times. So the problem of finding the nnth Fibonacci number has overlapping subproblems.

Let's see if we can build up to the given weight capacity, one capacity at a time, using the max values from previous capacities. How can we do this?

Well, let’s try one more weight capacity by hand—3 kg. So we already know the max values at capacities 1 and 2. And just like we did with max_value_at_capacity_1 and max_value_at_capacity_2, now we’ll track max_value_at_capacity_3 and loop through every cake:

  max_value_at_capacity_3 = 0

for cake_weight, cake_value in cake_tuples:
    # Only care about cakes that weigh 3 kg or less
    ...

What do we do for each cake?

If the current cake weighs 3 kg, easy—we see if it’s more valuable than our current max_value_at_capacity_3.

What if the current cake weighs 2 kg?

Well, let's see what our max value would be if we used the cake. How can we calculate that?

If we include the current cake, we can only carry 1 more kilogram. What would be the max value we can carry?

We already know the max_value_at_capacity_1! We can just add that to the current cake’s value!

Now we can see which is higher—our current max_value_at_capacity_3, or the new max value if we use the cake:

  max_value_using_cake    = max_value_at_capacity_1 + cake_value
max_value_at_capacity_3 = max(max_value_at_capacity_3, max_value_using_cake)

Finally, what if the current cake weighs 1 kg?

Basically the same as if it weighs 2 kg:

  max_value_using_cake    = max_value_at_capacity_2 + cake_value
max_value_at_capacity_3 = max(max_value_at_capacity_3, max_value_using_cake)

There’s gotta be a pattern here. We can keep building up to higher and higher capacities until we reach our input capacity. Because the max value we can carry at each capacity is calculated using the max values at previous capacities, we'll need to solve the max value for every capacity from 0 up to our duffel bag's actual weight capacity.

Can we write a function to handle all the capacities?

To start, we'll need a way to store and update all the max monetary values for each capacity.

We could use a dictionary,

Quick reference

Average Worst Case
space O(n)O(n) O(n)O(n)
insert O(1)O(1) O(n)O(n)
lookup O(1)O(1) O(n)O(n)
delete O(1)O(1) O(n)O(n)

A hash table organizes data so you can quickly look up values for a given key.

Strengths:

  • Fast lookups. Lookups take O(1)O(1) time on average.
  • Flexible keys. Most data types can be used for keys, as long as they're hashable.

Weaknesses:

  • Slow worst-case lookups. Lookups take O(n)O(n) time in the worst case.
  • Unordered. Keys aren't stored in a special order. If you're looking for the smallest key, the largest key, or all the keys in a range, you'll need to look through every key to find it.
  • Single-directional lookups. While you can look up the value for a given key in O(1)O(1) time, looking up the keys for a given value requires looping through the whole dataset—O(n)O(n) time.
  • Not cache-friendly. Many hash table implementations use linked lists, which don't put data next to each other in memory.

In Python 3.6

In Python 3.6, hash tables are called dictionaries.

  light_bulb_to_hours_of_light = {
    'incandescent': 1200,
    'compact fluorescent': 10000,
    'LED': 50000,
}

Hash maps are built on arrays

Arrays are pretty similar to hash maps already. Arrays let you quickly look up the value for a given "key" . . . except the keys are called "indices," and we don't get to pick them—they're always sequential integers (0, 1, 2, 3, etc).

Think of a hash map as a "hack" on top of an array to let us use flexible keys instead of being stuck with sequential integer "indices."

All we need is a function to convert a key into an array index (an integer). That function is called a hashing function.

A blank array except for a 20, labeled as the value, stored at index 9. To the left the array is the word "lies," labeled as the key, with an arrow pointing to the right at diamond with a question mark in the middle, labeled as the hashing function. The diamond points to the 9th index of the array.

To look up the value for a given key, we just run the key through our hashing function to get the index to go to in our underlying array to grab the value.

How does that hashing function work? There are a few different approaches, and they can get pretty complicated. But here's a simple proof of concept:

Grab the number value for each character and add those up.

The word "lies" in quotes. Arrows point from each character down to their corresponding number values, which are separated by plus signs and shown in sum to equal 429.

The result is 429. But what if we only have 30 slots in our array? We'll use a common trick for forcing a number into a specific range: the modulus operator (%). Modding our sum by 30 ensures we get a whole number that's less than 30 (and at least 0):

429%30=9 429 \: \% \: 30 = 9

The hashing functions used in modern systems get pretty complicated—the one we used here is a simplified example.

Hash collisions

What if two keys hash to the same index in our array? In our example above, look at "lies" and "foes":

The word "lies" in quotes and the word "foes" in quotes. Arrows point from the characters of each word to their corresponding number values. The sum of the characters of both words is shown to equal 429.

They both sum up to 429! So of course they'll have the same answer when we mod by 30:

429%30=9 429 \: \% \: 30 = 9

This is called a hash collision. There are a few different strategies for dealing with them.

Here's a common one: instead of storing the actual values in our array, let's have each array slot hold a pointer to a linked list holding the values for all the keys that hash to that index:

An array storing pointers. The pointer at index 9 has an arrow pointing to a linked list to the right of the array. Each linked list node now stores the word as well as its count and a pointer.

Notice that we included the keys as well as the values in each linked list node. Otherwise we wouldn't know which key was for which value!

There are other ways to deal with hash collisions. This is just one of them.

When hash table operations cost O(n)O(n) time

Hash collisions

If all our keys caused hash collisions, we'd be at risk of having to walk through all of our values for a single lookup (in the example above, we'd have one big linked list). This is unlikely, but it could happen. That's the worst case.

Dynamic array resizing

Suppose we keep adding more items to our hash map. As the number of keys and values in our hash map exceeds the number of indices in the underlying array, hash collisions become inevitable.

To mitigate this, we could expand our underlying array whenever things start to get crowded. That requires allocating a larger array and rehashing all of our existing keys to figure out their new position—O(n)O(n) time.

Sets

A set is like a hash map except it only stores keys, without values.

Sets often come up when we're tracking groups of items—nodes we've visited in a graph, characters we've seen in a string, or colors used by neighboring nodes. Usually, we're interested in whether something is in a set or not.

Sets are usually implemented very similarly to hash maps—using hashing to index into an array—but they don't have to worry about storing values alongside keys. In Python, the set implementation is largely copied from the dictionary implementation.

  light_bulbs = set()

light_bulbs.add('incandescent')
light_bulbs.add('compact fluorescent')
light_bulbs.add('LED')

'LED' in light_bulbs  # True
'halogen' in light_bulbs  # False
where the keys represent capacities and the values represent the max possible monetary values at those capacities. Dictionaries are built on arrays,

Quick reference

Worst Case
space O(n)O(n)
lookup O(1)O(1)
append O(1)O(1)
insert O(n)O(n)
delete O(n)O(n)

An array organizes items sequentially, one after another in memory.

Each position in the array has an index, starting at 0.

Strengths:

  • Fast lookups. Retrieving the element at a given index takes O(1)O(1) time, regardless of the length of the array.
  • Fast appends. Adding a new element at the end of the array takes O(1)O(1) time, if the array has space.

Weaknesses:

  • Fixed size. You need to specify how many elements you're going to store in your array ahead of time. (Unless you're using a fancy dynamic array.)
  • Costly inserts and deletes. You have to "scoot over" the other elements to fill in or close gaps, which takes worst-case O(n)O(n) time.

In Python 3.6

Some languages (including Python) don't have these bare-bones arrays.

Here's what arrays look like in Java.

  // instantiate an array that holds 10 integers
int gasPrices[] = new int[10];

gasPrices[0] = 346;
gasPrices[1] = 360;
gasPrices[2] = 354;
Java

Inserting

If we want to insert something into an array, first we have to make space by "scooting over" everything starting at the index we're inserting into:

An array of letters. From top to bottom, the values in the array are A, B, C, E, F, and G. The letter D is being inserted at the position of E, and the letters E, F, and G are each shown "scooting over" one index up to make room.

In the worst case we're inserting into the 0th index in the array (prepending), so we have to "scoot over" everything in the array. That's O(n)O(n) time.

Deleting

Array elements are stored adjacent to each other. So when we remove an element, we have to fill in the gap—"scooting over" all the elements that were after it:

Another array of letters. From top to bottom, the values in the array are A, B, C, Z, D, E, and F. The letter Z is being deleted, and the letters D, E, and F are each shown "scooting over" one index down to fill the space created by the deletion.

In the worst case we're deleting the 0th item in the array, so we have to "scoot over" everything else in the array. That's O(n)O(n) time.

Why not just leave the gap? Because the quick lookup power of arrays depends on everything being sequential and uninterrupted. This lets us predict exactly how far from the start of the array the 138th or 9,203rd item is. If there are gaps, we can no longer predict exactly where each array item will be.

Data structures built on arrays

Arrays are the building blocks for lots of other, more complex data structures.

Don't want to specify the size of your array ahead of time? One option: use a dynamic array.

Want to look up items by something other than an index? Use a dictionary.

so we can save some overhead by just using a list.

  def max_duffel_bag_value(cake_tuples, weight_capacity):
    # List to hold the maximum possible value at every
    # integer capacity from 0 to weight_capacity
    # starting each index with value 0
    max_values_at_capacities = [0] * (weight_capacity + 1)

What do we do next?

We’ll need to work with every capacity up to the input weight capacity. That’s an easy loop:

  # Every integer from 0 to the input weight_capacity
for current_capacity in range(weight_capacity + 1):
    ...

What will we do inside this loop? This is where it gets a little tricky.

We care about any cakes that weigh the current capacity or less. Let's try putting each cake in the bag and seeing how valuable of a haul we could fit from there.

So we'll write a loop through all the cakes (ignoring cakes that are too heavy to fit):

  for cake_weight, cake_value in cake_tuples:
    # If the cake weighs as much or less than the current capacity
    # see what our max value could be if we took it!
    if cake_weight <= current_capacity:
        # Find max_value_using_cake
        ...

And put it in our function body so far:

  def max_duffel_bag_value(cake_tuples, weight_capacity):
    # We make a list to hold the maximum possible value at every
    # duffel bag weight capacity from 0 to weight_capacity
    # starting each index with value 0
    max_values_at_capacities = [0] * (weight_capacity + 1)

    for current_capacity in range(weight_capacity + 1):

        for cake_weight, cake_value in cake_tuples:
            # If the cake weighs as much or less than the current capacity
            # see what our max value could be if we took it!
            if cake_weight <= current_capacity:
                # Find max_value_using_cake
                ...

How do we compute max_value_using_cake?

Remember when we were calculating the max value at capacity 3kg and we "hard-coded" the max_value_using_cake for cakes that weigh 3 kg, 2kg, and 1kg?

  # cake weighs 3 kg
max_value_using_cake = cake_value

# cake weighs 2 kg
max_value_using_cake = max_value_at_capacity_1 + cake_value

# cake weighs 1 kg
max_value_using_cake = max_value_at_capacity_2 + cake_value

How can we generalize this? With our new function body, look at the variables we have in scope:

  1. max_values_at_capacities
  2. current_capacity
  3. cake_weight
  4. cake_value

Can we use these to get max_value_using_cake for any cake?

Well, let's figure out how much space would be left in the duffel bag after putting the cake in:

  remaining_capacity_after_taking_cake = current_capacity - cake_weight

So max_value_using_cake is:

  1. the current cake's value, plus
  2. the best value we can fill the remaining_capacity_after_taking_cake with
  remaining_capacity_after_taking_cake = current_capacity - cake_weight
max_value_using_cake = cake_value + max_values_at_capacities[remaining_capacity_after_taking_cake]

We can squish this into one line:

  max_value_using_cake = cake_value + max_values_at_capacities[current_capacity - cake_weight]

Since remaining_capacity_after_taking_cake is a lower capacity, we'll have always already computed its max value and stored it in our max_values_at_capacities!

Now that we know the max value if we include the cake, should we include it? How do we know?

Let's allocate a variable current_max_value that holds the highest value we can carry at the current capacity. We can start it at zero, and as we go through all the cakes, any time the value using a cake is higher than current_max_value, we'll update current_max_value!

  current_max_value = max(max_value_using_cake, current_max_value)

What do we do with each value for current_max_value? What do we need to do for each capacity when we finish looping through all the cakes?

We save each current_max_value in the max_values_at_capacities list. We'll also need to make sure we set current_max_value to zero in the right place in our loops—we want it to reset every time we start a new capacity.

So here's our function so far:

  def max_duffel_bag_value(cake_tuples, weight_capacity):
    # We make a list to hold the maximum possible value at every
    # duffel bag weight capacity from 0 to weight_capacity
    # starting each index with value 0
    max_values_at_capacities = [0] * (weight_capacity + 1)

    for current_capacity in range(weight_capacity + 1):
        # Set a variable to hold the max monetary value so far for
        # the current weight capacity
        current_max_value = 0

        for cake_weight, cake_value in cake_tuples:
            # If the current cake weighs as much or less than the
            # current weight capacity it's possible taking the cake
            # would get a better value
            if cake_weight <= current_capacity:

                # Should we use the cake or not?
                # If we use the cake, the most kilograms we can include in
                # addition to the cake we're adding is the current capacity
                # minus the cake's weight. We find the max value at that
                # integer capacity in our list max_values_at_capacities
                max_value_using_cake = (
                    cake_value
                    + max_values_at_capacities[current_capacity - cake_weight]
                )

                # Now we see if it's worth taking the cake. How does the
                # value with the cake compare to the current_max_value?
                current_max_value = max(max_value_using_cake,
                                        current_max_value)

        # Add each capacity's max value to our list so we can use them
        # when calculating all the remaining capacities
        max_values_at_capacities[current_capacity] = current_max_value

Looking good! But what's our final answer?

Our final answer is max_values_at_capacities[weight_capacity]!

Okay, this seems complete. What about edge cases?

Remember, weights and values can be any non-negative integer. What about zeroes? How can we handle duffel bags that can’t hold anything and cakes that weigh nothing?

Well, if our duffel bag can’t hold anything, we can just return 0. And if a cake weighs 0 kg, we return infinity. Right?

Not that simple!

What if our duffel bag holds 0 kg, and we have a cake that weighs 0 kg. What do we return?

And what if we have a cake that weighs 0 kg, but its value is also 0. If we have other cakes with positive weights and values, what do we return?

If a cake’s weight and value are both 0, it’s reasonable to not have that cake affect what we return at all.

If we have a cake that weighs 0 kg and has a positive value, it’s reasonable to return infinity, even if the capacity is 0.

For returning infinity, we have several choices. We could return:

  1. Python 3.6's float('inf').
  2. Return a custom response, like the string 'infinity'.
  3. Raise an exception indicating the answer is infinity.

What are the advantages and disadvantages of each option?

For the first option the advantage is we get the behavior of infinity. Compared to any other integer, float('inf') will be greater. And it's a number, which can be an advantage or disadvantage—we might want our result to always be the same type, but without manually checking we won't know if we mean an actual value or the special case of infinity.

For the second option the advantage is we can create a custom behavior that we—or our function's users—could know to expect. The disadvantage is we'd have to explicitly check for that behavior, otherwise we might end up trying to parse the string "infinity" as an integer, which could give us an error or (perhaps worse) a random number. In a production system, a function that sometimes returns an integer and sometimes returns a string would probably be seen as sloppy.

The third option is a good choice if we decide infinity is usually an "unacceptable" answer. For example, we might decide an infinite answer means we've probably entered our inputs wrong. Then, if we really wanted to "accept" an infinite answer, we could always "catch" this exception when we call our function.

Any option could be reasonable. We'll go with the first one here.

Solution

This is a classic computer science puzzle called "the unbounded knapsack problem."

We use a bottom-up

Going bottom-up is a way to avoid recursion, saving the memory cost that recursion incurs when it builds up the call stack.

Put simply, a bottom-up algorithm "starts from the beginning," while a recursive algorithm often "starts from the end and works backwards."

For example, if we wanted to multiply all the numbers in the range 1..n1..n, we could use this cute, top-down, recursive one-liner:

  def product_1_to_n(n):
    # We assume n >= 1
    return n * product_1_to_n(n - 1) if n > 1 else 1

This approach has a problem: it builds up a call stack of size O(n)O(n), which makes our total memory cost O(n)O(n). This makes it vulnerable to a stack overflow error, where the call stack gets too big and runs out of space.

To avoid this, we can instead go bottom-up:

  def product_1_to_n(n):
    # We assume n >= 1
    result = 1
    for num in range(1, n + 1):
        result *= num

    return result

This approach uses O(1)O(1) space (O(n)O(n) time).

Some compilers and interpreters will do what's called tail call optimization (TCO), where it can optimize some recursive functions to avoid building up a tall call stack. Python and Java decidedly do not use TCO. Some Ruby implementations do, but most don't. Some C implementations do, and the JavaScript spec recently allowed TCO. Scheme is one of the few languages that guarantee TCO in all implementations. In general, best not to assume your compiler/interpreter will do this work for you.

Going bottom-up is a common strategy for dynamic programming problems, which are problems where the solution is composed of solutions to the same problem with smaller inputs (as with multiplying the numbers 1..n1..n, above). The other common strategy for dynamic programming problems is memoization.

approach to find the max value at our duffel bag's weight_capacity by finding the max value at every capacity from 0 to weight_capacity.

We allocate a list max_values_at_capacities where the indices are capacities and each value is the max value at that capacity.

For each capacity, we want to know the max monetary value we can carry. To figure that out, we go through each cake, checking to see if we should take that cake.

The best monetary value we can get if we take a given cake is simply:

  1. that cake's value, plus
  2. the best monetary value we can carry in our remaining duffel bag capacity after taking the cake—which we'll already have stored in max_values_at_capacities

To handle weights and values of zero, we return infinity only if a cake weighs nothing and has a positive value.

  def max_duffel_bag_value(cake_tuples, weight_capacity):
    # We make a list to hold the maximum possible value at every
    # duffel bag weight capacity from 0 to weight_capacity
    # starting each index with value 0
    max_values_at_capacities = [0] * (weight_capacity + 1)

    for current_capacity in range(weight_capacity + 1):
        # Set a variable to hold the max monetary value so far
        # for current_capacity
        current_max_value = 0

        for cake_weight, cake_value in cake_tuples:
            # If a cake weighs 0 and has a positive value the value of
            # our duffel bag is infinite!
            if cake_weight == 0 and cake_value != 0:
                return float('inf')

            # If the current cake weighs as much or less than the
            # current weight capacity it's possible taking the cake
            # would get a better value
            if cake_weight <= current_capacity:

                # So we check: should we use the cake or not?
                # If we use the cake, the most kilograms we can include in
                # addition to the cake we're adding is the current capacity
                # minus the cake's weight. We find the max value at that
                # integer capacity in our list max_values_at_capacities
                max_value_using_cake = (
                    cake_value
                    + max_values_at_capacities[current_capacity - cake_weight]
                )

                # Now we see if it's worth taking the cake. how does the
                # value with the cake compare to the current_max_value?
                current_max_value = max(max_value_using_cake,
                                        current_max_value)

        # Add each capacity's max value to our list so we can use them
        # when calculating all the remaining capacities
        max_values_at_capacities[current_capacity] = current_max_value

    return max_values_at_capacities[weight_capacity]

Complexity

O(nk)O(n*k) time, and O(k)O(k) space, where nn is number of types of cake and kk is the capacity of the duffel bag. We loop through each cake (nn cakes) for every capacity (kk capacities), so our runtime is O(nk)O(n*k), and maintaining the list of k+1k+1 capacities gives us the O(k)O(k) space.

Congratulations! Because of dynamic programming, you have successfully stolen the Queen's cakes and made it big.

Keep in mind: in some cases, it might not be worth using our optimal dynamic programming solution. It's a pretty slow algorithm—without any context (not knowing how many cake types we have, what our weight capacity is, or just how they compare) it's easy to see O(nk)O(n*k) growing out of control quickly if nn or kk is large.

If we cared about time, like if there was an alarm in the vault and we had to move quickly, it might be worth using a faster algorithm that gives us a good answer, even if it's not always the optimal answer. Some of our first ideas in the breakdown were to look at cake values or value/weight ratios. Those algorithms would probably be faster, taking O(nlgn)O(n\lg{n}) time (we'd have to start by sorting the input).

Sometimes an efficient, good answer might be more practical than an inefficient, optimal answer.

Bonus

  1. We know the max value we can carry, but which cakes should we take, and how many? Try adjusting your answer to return this information as well.
  2. What if we check to see if all the cake weights have a common denominator? Can we improve our algorithm?
  3. A cake that's both heavier and worth less than another cake would never be in the optimal solution. This idea is called dominance relations. Can you apply this idea to save some time? Hint: dominance relations can apply to sets of cakes, not just individual cakes.
  4. What if we had a tuple for every individual cake instead of types of cakes? So now there's not an unlimited supply of a type of cake—there's exactly one of each. This is a similar but harder problem, known as the 0/1 Knapsack problem.

What We Learned

This question is our spin on the famous "unbounded knapsack problem"—a classic dynamic programming question.

If you're struggling with dynamic programming, we have reference pages for the two main dynamic programming strategies: memoization and going bottom-up.

Do you have an answer?

Wanna review this one again later? Or do you feel like you got it all?

Mark as done Pin for review later
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
import unittest
def max_duffel_bag_value(cake_tuples, weight_capacity):
# Calculate the maximum value we can carry
return -1
# Tests
class Test(unittest.TestCase):
def test_one_cake(self):
actual = max_duffel_bag_value([(2, 1)], 9)
expected = 4
self.assertEqual(actual, expected)
def test_two_cakes(self):
actual = max_duffel_bag_value([(4, 4), (5, 5)], 9)
expected = 9
self.assertEqual(actual, expected)
def test_only_take_less_valuable_cake(self):
actual = max_duffel_bag_value([(4, 4), (5, 5)], 12)
expected = 12
self.assertEqual(actual, expected)
def test_lots_of_cakes(self):
actual = max_duffel_bag_value([(2, 3), (3, 6), (5, 1), (6, 1), (7, 1), (8, 1)], 7)
expected = 12
self.assertEqual(actual, expected)
def test_value_to_weight_ratio_is_not_optimal(self):
actual = max_duffel_bag_value([(51, 52), (50, 50)], 100)
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

Reset editor

Powered by qualified.io

. . .